The performance of a pattern recognition system is characterised
by its error and reject tradeoff. This paper describes an optimum
rejection rule and presents a general relation between the error
and reject probabilities and some simple properties of the tradeoff
in the optimum recognition system. The error rate can be directly
evaluated from the reject function. Some practical implications of
the results are discussed. Examples in normal distributions and
uniform distributions are given.
The error rate and the reject rate are commonly used
to describe the performance level of pattern recognition systems.
An error or misrecognition occurs when a pattern from one
class is identified as that of a different class. The error is
sometimes referred to as a substitution error or undetected
error. A reject occurs when the recognition system withholds
its recognition decision, and the pattern is rejected for
exceptional handling, such as rescan or manual inspection.

Because of uncertainties and noise inherent in any
pattern recognition task, errors are generally unavoidable.
The option to reject is introduced to safeguard against excessive
misrecognition; it converts potential misrecognition into
rejection. However, the tradeoff between the errors and
rejects is seldom one for one. Whenever the reject option
is exercised, some would-be correct recognitions are also
converted into rejects. We are interested in the best error-reject
tradeoff in the optimum rejection scheme.

An optimum rejection scheme was derived in Ref. 1.
The error-reject tradeoff curves have been used to describe
and compare the empirical performances of recognition methods
(e.g., Refs. 2 and 3), and they have also been found useful
in the actual system design of an optical page reader (Ref. 4).
However, few theoretical results on the error-reject tradeoff
are available.

This paper first describes an optimum rejection rule
and then derives a general relation between the error and
reject probabilities. The error rate can be directly evaluated
from the reject function. This result provides a basis for
calculating the error rates from the empirical rejection curve
without actually identifying the errors. Some simple properties
of the optimum tradeoff are presented. Examples in normal
distributions and uniform distributions are given.
where $v$ is the pattern vector, $n$ is the number of classes,
$(p_1, p_2, \ldots, p_n)$ is the a priori probability distribution of the
classes, $F(v \mid i)$ is the conditional probability density for $v$
given the $i$th class, $d_i$ ($i \neq 0$) is the decision that $v$ is identified
as of the $i$th class while $d_0$ is the decision to reject, and $t$ is
a constant between 0 and 1 ($0 \le t \le 1$). The probability of error,
or error rate, is



$$ E(t) = \int_V \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \neq i}}^{n} \delta(d_j \mid v)\, p_i F(v \mid i)\, dv \tag{5} $$


and the probability of reject, or reject rate, is

$$ R(t) = \int_V \delta(d_0 \mid v) \sum_{i=1}^{n} p_i F(v \mid i)\, dv \tag{6} $$

where $V$ is the pattern space. Both the error and reject
rates are implicit functions of the parameter $t$.

The probability of correct recognition is

$$ C(t) = \int_V \sum_{i=1}^{n} \delta(d_i \mid v)\, p_i F(v \mid i)\, dv = 1 - E(t) - R(t) \tag{7} $$
and the probability of acceptance (or acceptance rate) is
defined as

$$ A(t) = C(t) + E(t). \tag{8} $$
Rejection Threshold

The parameter $t$ in the decision rule will be called
"the rejection threshold". For any fixed value of $t$ ($0 \le t \le 1$)
the decision rule $\delta$ partitions the pattern space $V$ into two
disjoint sets (or regions), $V_A(t)$ and $V_R(t)$, where equations
(2) and (3) respectively hold, namely:

$$ V_A(t) = \{\, v \mid \max_i\, [p_i F(v \mid i)] \ge (1-t)\, F(v) \,\} \tag{9} $$

$$ V_R(t) = \{\, v \mid \max_i\, [p_i F(v \mid i)] < (1-t)\, F(v) \,\} \tag{10} $$

where

$$ F(v) = \sum_{i=1}^{n} p_i F(v \mid i). \tag{11} $$

Without loss of generality, it will be assumed that
$F(v)$ is nonzero over the entire space $V$; otherwise the set
over which $F(v)$ is zero is first deleted. $V_A$ and $V_R$ are
called respectively the "acceptance region" and the "reject
region" of the decision rule. An example is depicted in
Fig. 1(a), where shading distinguishes the two regions.
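As an illustration of the partition in (9) and (10), the following
Python sketch applies the rule to a single pattern. It is a minimal
sketch and not part of the original derivation: the function name
`decide`, the constant `REJECT`, and the convention that ties between
classes go to the lowest class index are assumptions made for the
example.

```python
from typing import Sequence

REJECT = 0  # hypothetical label standing for the reject decision d_0


def decide(priors: Sequence[float], likelihoods: Sequence[float], t: float) -> int:
    """Return the accepted class (1-based) or REJECT for one pattern v.

    priors[i] is p_{i+1}; likelihoods[i] is F(v | i+1) evaluated at v.
    """
    joint = [p * f for p, f in zip(priors, likelihoods)]   # p_i F(v|i)
    total = sum(joint)                                     # F(v), Eq. (11)
    best = max(range(len(joint)), key=joint.__getitem__)   # first maximiser
    if joint[best] < (1.0 - t) * total:                    # v lies in V_R(t), Eq. (10)
        return REJECT
    return best + 1                                        # v lies in V_A(t), Eq. (9)
```

For instance, with equal priors and likelihoods 0.6 and 0.4, the
pattern is rejected at t = 0.3 but assigned to class 1 at t = 0.45.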

We shall now present some simple properties of the
rejection threshold $t$:

(a) both the error and reject rates are monotonic in $t$,
(b) $t$ is an upper bound of the error rate, and
(c) $t$ is a differential error-reject tradeoff ratio.


(a) Monotonicity

It follows immediately from the definitions of (9)
and (10) that for any $t_1$ and $t_2$ in $[0, 1]$, if $t_1 < t_2$ then

$$ V_A(t_1) \subset V_A(t_2) \quad \text{and} \quad V_R(t_1) \supset V_R(t_2). $$


With the aid of equations (1) and (4), (9) and (10),
the various probabilities can be written as

$$ R(t) = \int_{V_R(t)} F(v)\, dv \tag{6'} $$

$$ A(t) = \int_{V_A(t)} F(v)\, dv \tag{8'} $$

$$ C(t) = \int_{V_A(t)} \max_i\, [p_i F(v \mid i)]\, dv \tag{7'} $$

$$ E(t) = \int_{V_A(t)} \Big\{ \sum_{i=1}^{n} p_i F(v \mid i) - \max_i\, [p_i F(v \mid i)] \Big\}\, dv \tag{5'} $$
All the integrands in the above integrals are non-negative;
hence if the domain of integration expands, the value of the
integral increases. More specifically, if $t_1 < t_2$ then
$V_A(t_1) \subset V_A(t_2)$ and $V_R(t_1) \supset V_R(t_2)$;
therefore, $E(t_1) \le E(t_2)$ and $R(t_1) \ge R(t_2)$.
In other words, $E$ increases and $R$ decreases with
increasing $t$. In particular, when $t = 0$, $E = 0$, and when
$t = 1$, $A = 1$ and $R = 0$. Whenever $t \ge 1 - 1/n$, $R = 0$.

(b) An Upper Bound of Error Rate

We shall now show that

$$ E(t) \le t. \tag{12} $$

For any $v$ in $V_A(t)$, we have

$$ \max_i\, [p_i F(v \mid i)] \ge (1-t)\, F(v). $$

Therefore,

$$ \int_{V_A} \max_i\, [p_i F(v \mid i)]\, dv \ge (1-t) \int_{V_A} F(v)\, dv $$

which, with (7') and (8'), is

$$ C(t) \ge (1-t)\, A(t). $$

Hence, since $E(t) = A(t) - C(t)$ by (8) and $A(t) \le 1$,

$$ E(t) \le t\, A(t) \le t. $$

(c) is shown in the following section.
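Properties (a) and (b) can be checked numerically on a small
discrete pattern space, where the integrals in (5') to (8') reduce
to sums. The sketch below is illustrative only: the table `JOINT`
of values $p_i F(v_k \mid i)$ is a hypothetical example, and the
names `rates`, `THRESHOLDS`, and `CURVE` are assumptions.

```python
def rates(joint, t):
    """joint[k][i] = p_i * F(v_k | i) for pattern v_k and class i.

    Returns (E, R, C, A) at rejection threshold t, following (5')-(8').
    """
    E = R = C = 0.0
    for row in joint:
        total, best = sum(row), max(row)
        if best < (1.0 - t) * total:   # v_k lies in the reject region V_R(t)
            R += total
        else:                          # v_k lies in the acceptance region V_A(t)
            C += best
            E += total - best
    return E, R, C, E + C


# Hypothetical three-pattern, two-class joint probabilities (they sum to 1).
JOINT = [[0.40, 0.05], [0.15, 0.10], [0.10, 0.20]]
THRESHOLDS = [k / 20 for k in range(21)]
CURVE = [rates(JOINT, t) for t in THRESHOLDS]

for t, (E, R, C, A) in zip(THRESHOLDS, CURVE):
    print(f"t={t:.2f}  E={E:.3f}  R={R:.3f}  t*A={t * A:.3f}")
```

In the printed table E never decreases, R never increases, and E
stays at or below t A(t), as (a) and (b) require.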
Error-Reject Tradeoff

A complete description of the performance of recognition
systems is given by the error-reject tradeoff, i.e., the functional
relation of $E$ and $R$ at all levels. A typical tradeoff curve is given
in Fig. 2. Since both $E$ and $R$ of the optimum recognition
systems are monotonic functions of the rejection threshold
$t$, one can compute the tradeoff $E$ vs. $R$ from $E(t)$ and $R(t)$.

We shall now show that the rejection function $R(t)$ alone
suffices to completely characterise the optimum recognition
performance. In other words, $E$ can be derived from $R(t)$,
or from its inverse $t(R)$. The central result is the simple
functional relation between $E$ and $R$, namely

$$ E = -\int_{1}^{R} t(R)\, dR. \tag{13} $$

This relation is valid for all optimum decision rules
as defined in Equations (1) - (4). No explicit forms for the
density functions $F(v \mid i)$ are required in deriving the integral
relation of (13). However, it will be assumed for convenience
in the following derivation that $R(t)$ is differentiable with
respect to $t$. Under this assumption, the inverse function $t(R)$
is single-valued. However, this assumption will later be
removed.
Consider an incremental change in the rejection threshold
from $t$ to $t - \Delta t$; the reject region expands from $V_R(t)$ to
$V_R(t - \Delta t)$. Let $\Delta V_R(t)$ denote the incremental region
$V_R(t - \Delta t) - V_R(t)$. For any $v$ in $\Delta V_R(t)$, it was accepted
at the threshold $t$ and is now rejected at the lower threshold
$t - \Delta t$. Equations (2) and (4) now give:

$$ (1-t)\, F(v) \le \max_i\, p_i F(v \mid i) < (1 - t + \Delta t)\, F(v) \quad \text{for } v \in \Delta V_R(t). \tag{14} $$

By integrating the last expression over the incremental
region $\Delta V_R$, one obtains

$$ (1-t)\, \Delta R \le -\Delta C < (1 - t + \Delta t)\, \Delta R \tag{15} $$

where $\Delta R$ and $\Delta C$ are respectively the increments in the
rejection rate and correct recognition rate, namely

$$ \Delta R = \int_{\Delta V_R} F(v)\, dv $$

$$ \Delta C = -\int_{\Delta V_R} \max_i\, [p_i F(v \mid i)]\, dv. $$

Of course, the increment in the error rate is simply

$$ \Delta E = -\Delta R - \Delta C. \tag{16} $$
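By substituting (16) into (15), one has

$$ -t\, \Delta R \le \Delta E < -(t - \Delta t)\, \Delta R \tag{17a} $$

$$ -t\, \frac{\Delta R}{\Delta t} \le \frac{\Delta E}{\Delta t} < -(t - \Delta t)\, \frac{\Delta R}{\Delta t}. \tag{17b} $$

Since $R(t)$ is differentiable, $\Delta R \to 0$ as $\Delta t \to 0$ and (17) yields

$$ \frac{dE}{dt} = -t\, \frac{dR}{dt}. \tag{18} $$

By integrating (18) from $t = 0$ to $t$, one has

$$ E(t) - E(0) = -\int_{0}^{t} t\, \frac{dR}{dt}\, dt = -\int_{R(0)}^{R(t)} t(R)\, dR. $$

Since $E(0) = 0$ and $R(0) = 1$, the above expression becomes

$$ E = -\int_{1}^{R} t\, dR. $$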


This relation is depicted in Fig. 3. Equation (13) can also
be written, through an integration by parts and as indicated
in Fig. 3, as

$$ E = \int_{0}^{t} R(t)\, dt - t\, R(t). \tag{19} $$
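The form (19) lends itself to direct numerical evaluation once the
reject curve is tabulated. The following Python sketch is one
possible way to do this, using the trapezoidal rule on a grid that
starts at t = 0. The function name `error_from_reject_curve` and
the test curve R(t) = (1 - 2t)^2 on [0, 1/2] are assumptions chosen
only because the integral can then be checked in closed form (it
gives E(1/2) = 1/6); they do not come from the text.

```python
def error_from_reject_curve(ts, Rs):
    """Evaluate E(t_k) = integral_0^{t_k} R dt - t_k R(t_k), Eq. (19).

    ts is an increasing grid with ts[0] == 0; Rs[k] is R(ts[k]).
    """
    errors, area = [], 0.0
    for k, (t, R) in enumerate(zip(ts, Rs)):
        if k > 0:  # trapezoidal rule for the running area under R(t)
            area += 0.5 * (Rs[k - 1] + R) * (t - ts[k - 1])
        errors.append(area - t * R)
    return errors


# Hypothetical reject curve, used only to exercise the function.
ts = [0.5 * k / 1000 for k in range(1001)]
Rs = [(1.0 - 2.0 * t) ** 2 for t in ts]
E = error_from_reject_curve(ts, Rs)
print(E[-1])  # close to 1/6
```

Only the reject values enter the computation; no error is ever
counted directly.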
Equation (18) gives

$$ \frac{dE}{dR} = -t \le 0. \tag{20} $$

The rejection threshold is the differential error-reject
tradeoff. In particular, the initial slope of the error-reject curve
is $-1$ or greater, while the final slope is 0. Equation (20) also gives:

$$ \frac{d^2 E}{dR^2} = -\frac{dt}{dR} \ge 0. \tag{21} $$

The optimum error-reject curve is always concave
upward, and the slope increases from $-1$ to 0 as $R$ increases
from 0 to 1 (Figure 2).

Although the integral relation (13) is derived under the
assumption that $R(t)$ is differentiable, the assumption of
differentiability is not essential. For example, it suffices
to assume that $R(t)$ is continuous. A proof for (13) would start
with (17a) instead of (17b). Actually no assumption about $R(t)$
is necessary for the validity of (13). $R(t)$ is, of course, a
monotonic and bounded function of $t$ and is thus of bounded variation.
Consequently, the Stieltjes integral $\int_{t=0}^{t} t\, dR(t)$ always exists.

The error-reject integral is in general
Another Proof of the Error-Reject Integral

We shall now present another proof* of the error-reject
integral of (13) with a hope to provide additional insight to the
relation. Let $M(v)$ denote the random variable

$$ M(v) = \frac{\max_i\, p_i F(v \mid i)}{F(v)}. \tag{23} $$

$M(v)$ is the maximum of the a posteriori probabilities of the
classes given the pattern $v$. Let $g(m)$ be the density function
of $M$. In general $g(m)$ is a rather complex function of the
underlying density functions $F(v \mid i)$; however, its explicit form
does not concern our proof here. In terms of the variable $M(v)$,
the reject condition (4) becomes

$$ M(v) < 1 - t \tag{24} $$

and the probability of reject

$$ R(t) = \int_{0}^{1-t} g(m)\, dm \tag{25} $$

which also gives

$$ g(m) = \frac{dR(1 - m)}{dm}. \tag{26} $$

*This proof is due to Mr. M. Hellman and Dr. J. Raviv of
IBM Watson Research Center.



Since $g(m)$ is non-negative, $R(t)$ is a monotonic increasing
function of the upper limit of the integral of (25), or a
monotonic decreasing function of $t$. By the definition of $M(v)$,
the probability of correct recognition for a given reject
threshold $t$ is

$$ C(t) = \int_{1-t}^{1} m\, g(m)\, dm. \tag{27} $$

By substituting (26) in (27) and integrating by parts, we obtain:

$$ C(t) = \int_{t}^{0} (1-t)\, dR(t) = 1 - (1-t)\, R(t) - \int_{0}^{t} R(t)\, dt. $$

The error rate is then

$$ E(t) = 1 - R(t) - C(t) = \int_{0}^{t} R(t)\, dt - t\, R(t) $$

which is (19) and is equivalent to (13).
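Equations (25) and (27) suggest a simple empirical procedure: given
a sample of values of $M(v)$, the reject rate is the fraction of
samples below $1 - t$ and the correct rate is the average of $M$ over
the accepted samples. The Python sketch below illustrates this
under the assumption that the values of $M(v)$ are the true maximum
a posteriori probabilities; the function name
`rates_from_max_posterior` and the sample values are hypothetical.

```python
def rates_from_max_posterior(ms, t):
    """Estimate (E, R, C) at threshold t from samples of M(v).

    ms holds values of the largest a posteriori probability, one per pattern.
    """
    n = len(ms)
    accepted = [m for m in ms if m >= 1.0 - t]   # complement of condition (24)
    R = 1.0 - len(accepted) / n                  # sample version of (25)
    C = sum(accepted) / n                        # sample version of (27)
    return 1.0 - R - C, R, C


# Hypothetical sample of maximum posteriors for six unlabeled patterns.
SAMPLE = [0.55, 0.6, 0.75, 0.9, 0.95, 1.0]
E, R, C = rates_from_max_posterior(SAMPLE, 0.3)
print(E, R, C)
```

The error estimate equals the average of $1 - M$ over the accepted
samples, so no class labels are needed.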
Rejection Threshold of a Minimum Risk Rule

It is known that the optimum decision rule given in
Equations (1) to (4) is also a minimum risk rule if the cost
function is uniform within each class of decisions, i.e., if no
distinction is made among the errors, among the rejects,
and among the correct recognitions. The rejection threshold
is then related to the costs as follows:

$$ t = \frac{W_r - W_c}{W_e - W_c} \tag{29} $$

where $W_e$, $W_r$, and $W_c$ are the costs for making an error,
reject and correct recognition respectively. Usually
$W_e > W_r > W_c$. The rejection threshold is simply the normalized
cost for the rejection. We can take $W_c = 0$ and $W_e = 1$, and
the minimum risk is

$$ \text{Risk}(t) = E(t) + t\, R(t) = \int_{0}^{t} R(t)\, dt \tag{30} $$

which is also depicted in Fig. 3.
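The relation between costs and threshold can be sketched in Python
as follows. This is a hedged illustration: it assumes the
normalized-cost form $t = (W_r - W_c)/(W_e - W_c)$, which is
consistent with the statement that the threshold is the normalized
rejection cost and with $W_c = 0$, $W_e = 1$ giving a risk of
$E + tR$. The function names and the cost values are hypothetical.

```python
def rejection_threshold(w_error, w_reject, w_correct=0.0):
    """Normalized rejection cost, assuming t = (W_r - W_c) / (W_e - W_c)."""
    if not w_error > w_reject > w_correct:
        raise ValueError("expected W_e > W_r > W_c")
    return (w_reject - w_correct) / (w_error - w_correct)


def risk(E, R, t):
    """Risk with W_c = 0 and W_e = 1, so that W_r = t: E(t) + t R(t)."""
    return E + t * R


# Hypothetical costs: an error costs 10, a reject 3, a correct decision 1.
t = rejection_threshold(10.0, 3.0, 1.0)
print(t)  # 2/9
```

A reject that costs nearly as much as an error pushes t toward 1,
where the rule almost never rejects.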


For numerical illustration, two examples are given
here. In these examples, the pattern vector $v$ is one-dimensional
and there are two pattern classes with equal a priori
probability of occurrence, i.e., $p_1 = p_2 = \tfrac{1}{2}$. The examples
respectively are concerned with the normal distributions and
uniform distributions.

For two classes, the condition for rejection, namely
Eq. (4), can never be satisfied when $t > \tfrac{1}{2}$; hence the reject
rate is always zero if the reject threshold $t$ exceeds $\tfrac{1}{2}$. The
effective range of $t$ for two-class problems is, therefore,
from 0 to $\tfrac{1}{2}$. With $n = 2$ and $0 \le t \le \tfrac{1}{2}$, it can be readily
verified that the condition for rejection, (4), is equivalent to

$$ \frac{t}{1-t} < \frac{p_1 F(v \mid 1)}{p_2 F(v \mid 2)} < \frac{1-t}{t}. \tag{31} $$

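(I) Normal Distributions of Equal Variance

Consider two normal distributions with means $\mu_1$ and
$\mu_2$ and equal variance $\sigma^2$ (Fig. 4). Take $\mu_1 > \mu_2$. The
density function is, for $i = 1$ or 2,

$$ F(v \mid i) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left[ -\frac{(v - \mu_i)^2}{2\sigma^2} \right]. \tag{32} $$

Substituting (32) into (31) and carrying out some algebraic
manipulations, the condition for rejection can be transformed to

$$ \left| v - \frac{\mu_1 + \mu_2}{2} \right| < \frac{\sigma^2}{\mu_1 - \mu_2}\, \ln\!\left( \frac{1}{t} - 1 \right), \tag{33} $$

i.e., the optimum rule is to reject whenever the pattern lies
within a certain distance of the midpoint between the two
means. The corresponding error and reject rates are

$$ E(t) = \Phi(a) \tag{34a} $$

$$ R(t) = \Phi(b) - \Phi(a) \tag{34b} $$

where $\Phi$ is the normal cumulative distribution function, namely

$$ \Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-x^2/2}\, dx \tag{35} $$

and the parameters are

$$ a = -\frac{s}{2} - \frac{1}{s}\, \ln\!\left( \frac{1}{t} - 1 \right) \tag{36a} $$

$$ b = -\frac{s}{2} + \frac{1}{s}\, \ln\!\left( \frac{1}{t} - 1 \right) \tag{36b} $$

$$ s = \frac{\mu_1 - \mu_2}{\sigma}. $$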
The parameter $s$ is the (normalized) separation between
the means of the distributions and is the only (composite)
parameter of the distributions that $R(t)$ and $E(t)$ depend upon.
It is straightforward to verify (18) and hence (13) for this
example. A set of the error, reject, and tradeoff curves
(for $s$ = 1, 2, 3, and 4) is depicted in Fig. 5.
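The verification mentioned above can also be carried out
numerically. The Python sketch below assumes the closed forms
$E(t) = \Phi(a)$ and $R(t) = \Phi(b) - \Phi(a)$ with
$a, b = -s/2 \mp \ln(1/t - 1)/s$, which follow from the equal-variance
setup with equal priors, and compares $E(t)$ against the integral
form $\int_0^t R\,dt - tR(t)$. The function names and the
midpoint-rule step count are assumptions.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_rates(t, s):
    """(E, R) for two equiprobable normal classes, separation s, 0 < t <= 1/2."""
    shift = math.log(1.0 / t - 1.0) / s
    a, b = -0.5 * s - shift, -0.5 * s + shift
    return phi(a), phi(b) - phi(a)


def error_by_integral(t, s, steps=2000):
    """Integral of R over [0, t] by the midpoint rule, minus t R(t)."""
    h = t / steps
    area = sum(normal_rates((k + 0.5) * h, s)[1] for k in range(steps)) * h
    return area - t * normal_rates(t, s)[1]


for s in (1, 2, 3, 4):
    direct = normal_rates(0.3, s)[0]
    print(s, round(direct, 5), round(error_by_integral(0.3, s), 5))
```

The two columns agree to within the accuracy of the numerical
integration, so the error curve can be recovered from the reject
curve alone.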


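(II) Uniform Distributions

Consider two uniform distributions:

$$ F(v \mid 1) = \begin{cases} 1 & \text{when } 0 \le v \le 1 \\ 0 & \text{elsewhere} \end{cases} \tag{37a} $$

$$ F(v \mid 2) = \begin{cases} \tfrac{1}{2} & \text{when } \tfrac{1}{2} \le v \le \tfrac{5}{2} \\ 0 & \text{elsewhere} \end{cases} \tag{37b} $$

which are shown in Fig. 6.

The reject function $R(t)$ is simply:

$$ R(t) = \begin{cases} \tfrac{3}{8} & \text{when } 0 \le t \le \tfrac{1}{3} \\ 0 & \text{when } \tfrac{1}{3} < t \le 1 \end{cases} \tag{38} $$

which is discontinuous (Fig. 7a) and the error-reject integral is
evaluated to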
Some Practical Implications

Most of our results on the error-reject tradeoff seem
consistent with our intuition, although the simple integral
relation between the error and reject rates is somewhat
unexpected. These results have some practical implications and
are useful in system design and performance evaluation.

Since the slope of the error-reject tradeoff curve (Fig. 2)
is, in magnitude, the value of the rejection threshold, the tradeoff
is most effective initially (i.e., at the low level of rejection) and
it becomes harder as the error rate decreases. This is certainly
confirmed in our practical experience; excessive rejection is
generally required to reduce residual errors.

Practical applications of the present results are in the areas
of system design and performance evaluation of the recognition
systems. The general characteristics of the error-reject tradeoff
curve provide the system designer a convenient means of
verifying the basic assumption on the underlying probability
distributions. The integral of (13) makes it possible to calculate
error rates, and consequently the tradeoff curve, from the
empirically observed reject rates. No class identification of the
sample patterns is required in obtaining the empirical rejection
curve. Or equivalently one can just obtain an empirical density
function of the maximum of the a posteriori probabilities, and
then calculate the error and reject rates using equations (25)
and (27).

In most recognition tasks, the underlying probability
distributions of the patterns are not completely known and the
design of the recognition system is generally based on empirical
data. A common design procedure is to assume, on the basis
of available (usually limited) a priori information and the designer's
intuition, some functional forms of the distributions, to derive
the system structure based on these assumptions, and to adjust
the system parameters by using the empirical data. It is not
always a simple matter to verify the validity of the assumptions
on which the system structure is based. However, one can
always, though laboriously, obtain the empirical error-reject
tradeoff curve and compute the theoretical one from the basic
assumptions. A comparison of the empirical and theoretical
tradeoff curves can quickly reveal how well the theoretical model
agrees with the empirical data and serve as a checkpoint for
initiating the process of revising and improving the theoretical
model.

The data used in any meaningful evaluation of a recognition
system is usually large and it is extremely costly and
time consuming to detect the recognition errors. To identify
a recognition error, additional information, usually human
inspection at some stage, is required. On the other hand,
the rejection is the explicit result of a definite decision,
and the rejects can be readily recorded and tallied. Equation
(13) provides a simple means of calculating the error rate
from the reject curve without actually identifying the errors.

Conclusion

A general error and reject tradeoff relation is derived
for the (Bayes) optimum recognition system with an option to
reject. The error probability is a Stieltjes integral of the
rejection threshold with respect to the reject probability.
The error function can be directly evaluated from the reject
function. Hence, the reject function determines the recognition
error and reject tradeoff and completely characterises
the performance of the optimum recognition system.

Some practical implications in the system design and
performance evaluation of the recognition systems are
discussed. The error-reject integral provides a simple means
of calculating the error rate from the empirical reject curve
without actually identifying the recognition errors.
Figure Captions

Fig. 1  Reject Regions in the Pattern Space
Fig. 2  Error-Reject Tradeoff Curve
Fig. 3  Reject Curve
Fig. 4  Example in Normal Distribution
Fig. 5  Normal Distributions: (a) Reject and Error Curves
        (b) Tradeoff Curve
Fig. 6  Example in Uniform Distributions
Fig. 7  Uniform Distributions: (a) Reject Curve (b) Error
        Curve (c) Tradeoff Curve